# ImageNet fine-tuning
| Model | Author | License | Library | Task | Downloads | Likes | Description |
|---|---|---|---|---|---|---|---|
| Convnextv2 Tiny.fcmae | timm | | Transformers | Image Classification | 2,463 | 1 | Self-supervised feature-representation model based on ConvNeXt-V2, pre-trained with the Fully Convolutional Masked Autoencoder (FCMAE) framework; suited to image feature extraction and fine-tuning. |
| Data2vec Vision Base Ft1k | facebook | Apache-2.0 | Transformers | Image Classification | 7,520 | 2 | Self-supervised learning model based on the BEiT architecture, fine-tuned on ImageNet-1k for image classification. |
| Data2vec Vision Large Ft1k | facebook | Apache-2.0 | Transformers | Image Classification | 68 | 5 | Large variant of the BEiT-based self-supervised vision model, fine-tuned on ImageNet-1k for image classification. |
| Regnet Y 1280 Seer In1k | facebook | Apache-2.0 | Transformers | Image Classification | 18 | 1 | RegNet image classification model trained on ImageNet-1k with self-supervised (SEER) pre-training followed by fine-tuning. |
| Regnet Y 640 Seer In1k | facebook | Apache-2.0 | Transformers | Image Classification | 21 | 0 | RegNet model pre-trained in a self-supervised manner (SEER) on billions of random web images, then fine-tuned on ImageNet-1k. |
| Vit Large Patch32 384 | google | Apache-2.0 | | Image Classification | 118.37k | 16 | Vision Transformer (ViT) pre-trained on ImageNet-21k and fine-tuned on ImageNet for image classification. |
| Vit Base Patch32 384 | google | Apache-2.0 | | Image Classification | 24.92k | 20 | Transformer-based ViT image classification model, pre-trained on ImageNet-21k and fine-tuned on ImageNet. |
| Beit Large Patch16 512 | microsoft | Apache-2.0 | | Image Classification | 683 | 11 | BEiT vision-Transformer image classification model, self-supervised pre-trained on ImageNet-21k and fine-tuned on ImageNet-1k. |
| Beit Base Patch16 224 | nielsr | Apache-2.0 | | Image Classification | 28 | 0 | BEiT vision model with BERT-like self-supervised pre-training; pre-trained and fine-tuned on ImageNet-22k, then further fine-tuned on ImageNet-1k. |
| Vit Base Patch16 384 | google | Apache-2.0 | | Image Classification | 30.30k | 38 | ViT image classification model pre-trained on ImageNet-21k and fine-tuned on ImageNet. |
| Vit Large Patch16 384 | google | Apache-2.0 | | Image Classification | 161.29k | 12 | Large ViT image classification model pre-trained on ImageNet-21k and fine-tuned on ImageNet. |
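Checkpoints like the ones listed above are typically loaded through the Hugging Face `transformers` pipeline API. A minimal sketch follows; the Hub repo ids in `CHECKPOINTS` are assumptions derived from the author and model-name columns above, not verified here.

```python
# Sketch: build an image-classification pipeline for one of the listed models.
# The repo ids below are assumptions inferred from the table (author/name),
# not verified against the Hub.
CHECKPOINTS = {
    "Data2vec Vision Base Ft1k": "facebook/data2vec-vision-base-ft1k",
    "Vit Base Patch16 384": "google/vit-base-patch16-384",
    "Beit Large Patch16 512": "microsoft/beit-large-patch16-512",
}

def load_classifier(display_name: str):
    """Return an image-classification pipeline for a listed model.

    Imports transformers lazily so the mapping is usable without it installed.
    Downloads the checkpoint weights on first use.
    """
    from transformers import pipeline
    return pipeline("image-classification", model=CHECKPOINTS[display_name])

if __name__ == "__main__":
    clf = load_classifier("Vit Base Patch16 384")
    print(clf("path/to/image.jpg"))  # top predicted labels with scores
```

Fine-tuning follows the same pattern with `AutoModelForImageClassification` and a standard training loop; the pipeline form shown here is only the quickest way to run inference on a single image.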